The Momentum Problem in MDL and Bayesian Prediction

Authors

  • Tim van Erven
  • Lambertus van Erven
Abstract

Preface

"Prediction is very difficult, especially about the future."

The Minimum Description Length (MDL) principle provides a powerful philosophy for learning from observations of the past [Grünwald et al., 2005; Rissanen, 1989]. It equates learning with compressing the observational data. As is common in science, there may be multiple contending explanations, or models, for the data. In this thesis we investigate an application of the MDL principle to prediction of the future when there are at least two such models. We will show that the regular, commonly used form of MDL can behave suboptimally, and we present a refinement of regular MDL that we call the Switch-Point procedure. Being based on data compression, the Switch-Point procedure may still be considered an application of the MDL principle, although it differs from the way in which MDL is usually applied.

For the convenience of readers with a background in Bayesian statistics, we give an interpretation of the regular MDL procedure as an instance of Bayesian Model Averaging (BMA). As a consequence, our results on MDL transfer directly to BMA.

Our first contribution is to identify the momentum phenomenon, which arises when one model enables the most accurate predictions of the future given few observations of the past, but predictions based on another model become more accurate when more data are collected. Essentially, this may happen whenever the models themselves represent compound explanations.

The momentum phenomenon will not occur, for example, if one model, M0, represents the conjecture that the data come from repeated tosses of a biased coin with probability 3/5 of coming up heads, and the other model, M1, describes the data as tosses of a coin with probability 4/7 of coming up heads. It can occur, however, if M1 were to represent the hypothesis that the data come from a coin with unknown probability p of coming up heads. This latter model combines all the specific explanations such as "the probability of coming up heads is 4/7" into the compound explanation "the probability of coming up heads may be any fixed value p". The momentum phenomenon can occur, in that case, if the relative frequency of heads in the data converges to some number f, which is close to, but not equal to, 3/5. If this happens, then for few observations of the past the slightly incorrect, …
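The coin example can be made concrete with a small numerical sketch. It is an illustration under stated assumptions, not the procedure analysed in the thesis: the frequency f = 11/20, the deterministic test sequence, the use of Laplace's rule of succession (a uniform prior on p) for M1, and all function names are choices made here. The sketch accumulates the code length in bits that each model needs for the same data, which is the quantity an MDL comparison between M0 and M1 would look at.

```python
import math
from fractions import Fraction


def balanced_sequence(f, n):
    """Deterministic 0/1 sequence with exactly floor(f * m) ones among its first m entries."""
    return [int(math.floor(f * (i + 1)) > math.floor(f * i)) for i in range(n)]


def code_length(p_heads, x):
    """Bits needed to encode outcome x (1 = heads) when heads was predicted with p_heads."""
    return -math.log2(p_heads if x == 1 else 1.0 - p_heads)


def cumulative_code_lengths(xs, theta0=0.6):
    """Return (cum0, cum1); cum0[n] and cum1[n] are the total bits after n outcomes."""
    cum0, cum1 = [0.0], [0.0]
    heads = 0
    for n, x in enumerate(xs):
        # Assumption: M1 predicts with Laplace's rule of succession, i.e. the
        # Bayesian predictive distribution under a uniform prior on p.
        p1 = (heads + 1) / (n + 2)
        cum0.append(cum0[-1] + code_length(theta0, x))  # M0: fixed p = 3/5
        cum1.append(cum1[-1] + code_length(p1, x))      # M1: unknown p
        heads += x
    return cum0, cum1


# Assumed relative frequency f = 11/20, close to but not equal to 3/5.
xs = balanced_sequence(Fraction(11, 20), 2000)
cum0, cum1 = cumulative_code_lengths(xs)

for n in (20, 300, 2000):
    print(f"n={n:5d}  M0: {cum0[n]:9.2f} bits  M1: {cum1[n]:9.2f} bits")

# Bits spent on outcomes 201..300 only, as a rough measure of recent accuracy.
print("window 201-300  M0:", cum0[300] - cum0[200], " M1:", cum1[300] - cum1[200])
```

In this run M0 still has the shorter total code length after 300 outcomes, even though M1 already encodes outcomes 201 to 300 in fewer bits; by 2000 outcomes M1's total is shorter as well. Under these assumptions, a comparison based on accumulated code length alone would therefore keep favouring M0 for some time after M1 has started to predict better.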


Similar resources

On the use of back propagation and radial basis function neural networks in surface roughness prediction

Various types of artificial neural networks are examined and compared for the prediction of surface roughness in manufacturing technology. The aim of the study is to evaluate different kinds of neural networks and observe their performance and applicability on the same problem. More specifically, feed-forward artificial neural networks are trained with three different back propagation algorithms, ...


Technical Report IDSIA-13-05: Asymptotics of Discrete MDL for Online Prediction (Jan Poland and Marcus ...)

Minimum Description Length (MDL) is an important principle for induction and prediction, with strong relations to optimal Bayesian learning. This paper deals with learning non-i.i.d. processes by means of two-part MDL, where the underlying model class is countable. We consider the online learning framework, i.e. observations come in one by one, and the predictor is allowed to update his state o...



Learning Bayesian Belief Networks Based on the Minimum Description Length Principle: Basic Properties

Summary: This paper addresses the problem of learning Bayesian belief networks (BBN) based on the minimum description length (MDL) principle. First, we give a formula of description length based on which the MDL-based procedure learns a BBN. Secondly, we point out that the difference between the MDL-based and Cooper and Herskovits procedures is essentially in the priors rather than in the approac...


A Disease Outbreak Prediction Model Using Bayesian Inference: A Case of Influenza

Introduction: One major problem in analyzing epidemic data is the lack of data and high dependency among the available data, which is due to the fact that the epidemic process is not directly observable. Methods: One method for epidemic data analysis to estimate the desired epidemic parameters, such as disease transmission rate and recovery rate, is data ...



Publication date: 2006